Installation

Install the library from PyPI, or install the latest version directly from GitHub:

pip install tfds_defect_detection
In [ ]:
!pip uninstall tfds_defect_detection
!pip install git+https://github.com/thetoby9944/tfds_defect_detection.git@master -U
WARNING: Skipping tfds-defect-detection as it is not installed.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/thetoby9944/tfds_defect_detection.git@master
  Cloning https://github.com/thetoby9944/tfds_defect_detection.git (to revision master) to /tmp/pip-req-build-juiuh_t8
  Running command git clone -q https://github.com/thetoby9944/tfds_defect_detection.git /tmp/pip-req-build-juiuh_t8
Collecting polygenerator~=0.2
  Downloading polygenerator-0.2.0-py2.py3-none-any.whl (5.8 kB)
Building wheels for collected packages: tfds-defect-detection
  Building wheel for tfds-defect-detection (setup.py) ... done
  Created wheel for tfds-defect-detection: filename=tfds_defect_detection-0.1.0-py3-none-any.whl size=95736 sha256=095f99440d06b49cf38e20181de3511d65ba8842f19118c122cdfc765cb82163
  Stored in directory: /tmp/pip-ephem-wheel-cache-gw78vdis/wheels/45/b7/d0/cd347013966d50d4c44c6062ef00ee6a491afa6c42351dd019
Successfully built tfds-defect-detection
Installing collected packages: polygenerator, tfds-defect-detection
Successfully installed polygenerator-0.2.0 tfds-defect-detection-0.1.0
In [ ]:
import tfds_defect_detection as tfd
import albumentations as A
from pathlib import Path

Examples

Start by loading the MVTec AD dataset

In [ ]:
ds = tfd.load(names=["mvtec"], data_dir=Path("."))
Downloading data from https://www.mydrive.ch/shares/38536/3830184030e49fe74747669442f0f282/download/420938113-1629952094/mvtec_anomaly_detection.tar.xz
5264982680/5264982680 [==============================] - 277s 0us/step
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:34<00:00, 104.46it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:21<00:00, 80.66it/s]
Dataset shape: <PrefetchDataset element_spec=TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)>
Uses 2903 of 3629 images from train_images
Here is the first batch

Disable download and preparation

After the first call, the files are cached and you can set download=False

In [ ]:
ds = tfd.load(names=["mvtec"], data_dir=Path("."), download=False)
Dataset shape: <PrefetchDataset element_spec=TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)>
Uses 2903 of 3629 images from train_images
Here is the first batch

Disable the preview

In [ ]:
ds = tfd.load(names=["mvtec"], data_dir=Path("."), peek=False, download=False)

Enable synthetic anomalies

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    create_artificial_anomalies=True, 
    drop_masks=False
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 53773.51it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 48185.03it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch
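The library generates synthetic anomalies by compositing defect patches into clean images (random polygon shapes come from the polygenerator dependency). The exact internals are not shown here; the sketch below only illustrates the compositing idea, and the function name composite_anomaly is illustrative, not part of the library's API.

```python
import numpy as np

def composite_anomaly(image, patch, mask):
    """Blend a defect patch into an image wherever the binary mask is 1.

    image, patch: float arrays of shape (H, W, C) with values in [0, 1]
    mask:         float array of shape (H, W, 1) with values in {0, 1}
    """
    return image * (1.0 - mask) + patch * mask

# Toy example: a 4x4 grey image with a black "defect" in the top-left corner.
image = np.full((4, 4, 3), 0.5, dtype=np.float32)
patch = np.zeros((4, 4, 3), dtype=np.float32)   # defect texture
mask = np.zeros((4, 4, 1), dtype=np.float32)
mask[:2, :2] = 1.0

result = composite_anomaly(image, patch, mask)
# Pixels under the mask take the patch value; the rest stay unchanged.
```

With drop_masks=False, the mask that produced each synthetic defect is returned alongside the image, which is what gives the dataset its (None, 256, 256, 2) mask channel above.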

Include original + synthetic defect

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    pairing_mode="result_with_original", 
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 48606.36it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 50441.48it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch

Add some artificial defects

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    create_artificial_anomalies=True, 
    pairing_mode="result_with_original", 
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 38214.18it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 24277.64it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch

Include the defect masks for training

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    create_artificial_anomalies=True, 
    drop_masks=False,
    pairing_mode="result_with_original", 
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 65059.79it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 68067.57it/s]
Dataset shape: <PrefetchDataset element_spec=((TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch

Include the processed image and a random image from the same class - contrastive pairs

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."),
    pairing_mode="result_with_contrastive_pair", 
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 65337.39it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 52271.23it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch
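Conceptually, a contrastive pair matches each image with a randomly drawn partner from the same class. The library's pairing logic is internal; this is a minimal stdlib sketch of the idea, with an illustrative function name and toy file names.

```python
import random
from collections import defaultdict

def contrastive_pairs(paths, labels, seed=123):
    """Pair every image path with a random other path of the same class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, label in zip(paths, labels):
        by_class[label].append(path)
    pairs = []
    for path, label in zip(paths, labels):
        # Prefer a different image; fall back to self if the class has one image.
        candidates = [p for p in by_class[label] if p != path] or [path]
        pairs.append((path, rng.choice(candidates)))
    return pairs

paths = ["a1.png", "a2.png", "b1.png", "b2.png"]
labels = ["bottle", "bottle", "cable", "cable"]
pairs = contrastive_pairs(paths, labels)
# Every partner comes from the same class as its anchor.
```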

Contrastive Pairs with artificial defects

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    create_artificial_anomalies=True, 
    drop_masks=False,
    pairing_mode="result_with_contrastive_pair", 
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 22099.68it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 53009.60it/s]
Dataset shape: <PrefetchDataset element_spec=((TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch

Data augmentation

The global_transform is applied to the paired image, whereas process_deviation is applied to the result image.

In [ ]:
# import albumentations as A

ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    pairing_mode="result_with_original", 

    global_transform=A.Compose([
      A.RandomBrightnessContrast(),
      A.HueSaturationValue(),
    ]),

    process_deviation=A.Compose([
      A.ShiftScaleRotate(shift_limit=0.01, scale_limit=0.0, rotate_limit=1.5),
      A.Blur(blur_limit=3),
      A.RandomBrightnessContrast(),
      A.RandomGamma(),
      A.HueSaturationValue(),
    ]),
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 36963.60it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 38515.09it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch
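For intuition, a transform like A.RandomBrightnessContrast boils down to a per-image scale and shift of pixel values (with the scale and shift drawn randomly per call). A minimal numpy sketch with fixed parameters, so the effect is easy to inspect — the function name and parameter values here are illustrative, not albumentations internals:

```python
import numpy as np

def brightness_contrast(image, alpha=1.0, beta=0.0):
    """Contrast scale (alpha) and brightness shift (beta), clipped to [0, 1].

    Albumentations draws alpha and beta randomly on each call; they are
    fixed here so the output is deterministic.
    """
    return np.clip(image * alpha + beta, 0.0, 1.0)

image = np.full((2, 2, 3), 0.4, dtype=np.float32)
brighter = brightness_contrast(image, alpha=1.0, beta=0.2)  # 0.4 -> 0.6
punchier = brightness_contrast(image, alpha=2.0, beta=0.0)  # 0.4 -> 0.8
```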

Artificial Defect Transforms

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    pairing_mode="result_with_original", 

    global_transform=A.Compose([
      A.RandomBrightnessContrast(),
      A.HueSaturationValue(),
    ]),

    process_deviation=A.Compose([
      A.ShiftScaleRotate(shift_limit=0.01, scale_limit=0.0, rotate_limit=1.5, p=1),
      A.Blur(blur_limit=3),
      A.RandomBrightnessContrast(),
      A.RandomGamma(),
      A.HueSaturationValue(),
    ]),

    anomaly_composition=A.Compose([
      A.RandomRotate90(),
      A.Transpose(),
      A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.50, rotate_limit=45, p=1),
      A.RandomGamma(),
      A.OpticalDistortion(),
      A.GridDistortion(),
      A.RandomContrast(0.5, p=1),
    ]),

    create_artificial_anomalies=True,
    drop_masks=False,
)
/usr/local/lib/python3.7/dist-packages/albumentations/augmentations/transforms.py:1641: FutureWarning: RandomContrast has been deprecated. Please use RandomBrightnessContrast
  FutureWarning,
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 62988.33it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 60249.44it/s]
Dataset shape: <PrefetchDataset element_spec=((TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 2903 of 3629 images from train_images
Here is the first batch

Choose different image sizes

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    
    width=312, 
    height=312
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 29049.34it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 55962.98it/s]
Dataset shape: <PrefetchDataset element_spec=TensorSpec(shape=(None, 312, 312, 3), dtype=tf.float32, name=None)>
Uses 2903 of 3629 images from train_images
Here is the first batch

Choose splits

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    
    subset_mode="validation"
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 58913.51it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 43118.36it/s]
Dataset shape: <PrefetchDataset element_spec=TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)>
Uses 725 of 3629 images from train_images
Here is the first batch

Access hand-labelled ground-truth evaluation data

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    
    subset_mode="test",
    drop_masks=False,
    shuffle=False
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 64317.26it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 54241.16it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 1380 of 1725 images from test_images
Here is the first batch

Change the train/validation and test/holdout split

In [ ]:
ds = tfd.load(
    names=["mvtec"], 
    data_dir=Path("."), 
    
    validation_split=0.1,
    subset_mode="validation"
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 61653.46it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 59905.73it/s]
Dataset shape: <PrefetchDataset element_spec=TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)>
Uses 362 of 3629 images from train_images
Here is the first batch
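The effect of validation_split can be sketched as a seeded, fractional partition of the file list: with validation_split=0.1 roughly 10% of the 3629 training images land in the validation subset, matching the "Uses 362 of 3629" output above. The library's exact shuffling and rounding are internal and may differ slightly from this stdlib sketch (which yields 363):

```python
import random

def split_files(files, validation_split=0.2, seed=123):
    """Deterministically split a file list into train/validation subsets."""
    rng = random.Random(seed)
    shuffled = files[:]            # don't mutate the caller's list
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_split)
    return shuffled[n_val:], shuffled[:n_val]

files = [f"img_{i:04d}.png" for i in range(3629)]
train, val = split_files(files, validation_split=0.1)
# ~10% of the 3629 images land in the validation subset.
```

Fixing the seed makes the split reproducible across calls, so train and validation never leak into each other between runs.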

Include more data, i.e. the Visual Anomaly (VisA) dataset

In [ ]:
ds = tfd.load(
    names=["mvtec", "visa"], 
    data_dir=Path("."), 
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 60814.46it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 58831.17it/s]
Downloading data from https://amazon-visual-anomaly.s3.us-west-2.amazonaws.com/VisA_20220922.tar

1929840640/1929840640 [==============================] - 46s 0us/step
Converting VisA to mvtec-style dataset
100%|██████████| 10821/10821 [00:32<00:00, 331.23it/s]
Preparing VisA_mvtec_style train_images
100%|██████████| 8659/8659 [00:14<00:00, 592.52it/s] 
Preparing VisA_mvtec_style test_images and test_masks
100%|██████████| 2162/2162 [00:14<00:00, 153.57it/s]
Dataset shape: <PrefetchDataset element_spec=TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)>
Uses 9830 of 12288 images from train_images
Here is the first batch

More hand-labelled data, including the VisA dataset

In [ ]:
ds = tfd.load(
    names=["mvtec", "visa"], 
    data_dir=Path("."), 
    
    subset_mode="test",
    drop_masks=False,
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 59513.33it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 41625.48it/s]
Converting VisA to mvtec-style dataset
Preparing VisA_mvtec_style train_images
100%|██████████| 8659/8659 [00:00<00:00, 51218.94it/s]
Preparing VisA_mvtec_style test_images and test_masks
100%|██████████| 2162/2162 [00:00<00:00, 59060.47it/s]
Dataset shape: <PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 3109 of 3887 images from test_images
Here is the first batch

Full example of all parameters

In [ ]:
ds = tfd.load(
    names = ("mvtec", "visa"),
    data_dir=Path("."),
    pairing_mode = "result_with_contrastive_pair",  # "result_only", "result_with_original"
    create_artificial_anomalies=True,
    validation_split=0.2,
    subset_mode = "training",                       # "validation", "test", "holdout", None
    drop_masks=False,
    width=256,
    height=256,
    repeat=True,
    anomaly_size = None,

    global_transform=A.Compose([
      A.RandomBrightnessContrast(),
      A.HueSaturationValue(),
    ]),

    process_deviation=A.Compose([
      A.ShiftScaleRotate(shift_limit=0.01, scale_limit=0.0, rotate_limit=1.5, p=1),
      A.Blur(blur_limit=3),
      A.RandomBrightnessContrast(),
      A.RandomGamma(),
      A.HueSaturationValue(),
    ]),

    anomaly_composition=A.Compose([
      A.RandomRotate90(),
      A.Transpose(),
      A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.50, rotate_limit=45, p=1),
      A.RandomGamma(),
      A.OpticalDistortion(),
      A.GridDistortion(),
      A.RandomContrast(0.5, p=1),
    ]),

    batch_size=24,
    seed=123,
    shuffle=True,
    peek=True,
    download=True,
    image_validation=False,
)
Preparing mvtec train_images
100%|██████████| 3629/3629 [00:00<00:00, 61858.42it/s]
Preparing mvtec test_images and test_masks
100%|██████████| 1725/1725 [00:00<00:00, 55825.24it/s]
Converting VisA to mvtec-style dataset
Preparing VisA_mvtec_style train_images
100%|██████████| 8659/8659 [00:00<00:00, 56129.15it/s]
Preparing VisA_mvtec_style test_images and test_masks
100%|██████████| 2162/2162 [00:00<00:00, 61246.02it/s]
Dataset shape: <PrefetchDataset element_spec=((TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(None, 256, 256, 2), dtype=tf.float32, name=None))>
Uses 9830 of 12288 images from train_images
Here is the first batch